Soft Topographic Maps for Clustering and Classifying Bacteria Using Housekeeping Genes
Authors
Abstract
The Self-Organizing Map (SOM) algorithm is widely used for building topographic maps of data represented in a vector space, but it does not operate on dissimilarity data. The Soft Topographic Map (STM) algorithm is an extension of SOM to arbitrary distance measures; it creates a map using a set of units, organized in a rectangular lattice, that define neighbourhood relationships among the data. In recent years, a new standard for identifying bacteria using genotypic information has begun to be developed. In this new approach, phylogenetic relationships among bacteria can be determined by comparing a stable part of the bacterial genetic code, the so-called “housekeeping genes.” The goal of this work is to build a topographic representation of bacteria clusters, by means of self-organizing maps, starting from genotypic features of housekeeping genes.
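To illustrate how a lattice of units can be trained from pairwise dissimilarities alone, the following is a minimal sketch of a batch median SOM. This is a simpler, different technique from the STM described in the abstract (it has no soft assignments and no annealing) and is not the authors' method. The function name `median_som`, its parameters, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def median_som(D, rows=2, cols=2, sigma=1.0, n_iter=10, seed=0):
    """Batch median SOM on a symmetric dissimilarity matrix D (n x n).

    Illustrative sketch only (not the STM algorithm). Each unit's prototype
    is constrained to be one of the data items, so only D is needed.
    Returns (protos, bmu): the prototype item index of each unit, and the
    index of the best-matching unit for each item.
    """
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    n_units = rows * cols
    rng = np.random.default_rng(seed)

    # Coordinates of the units on the rectangular lattice.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Gaussian neighbourhood function between every pair of units.
    g2 = ((grid[:, None, :] - grid[None, :, :]) ** 2).sum(axis=2)
    H = np.exp(-g2 / (2.0 * sigma ** 2))

    # Start from randomly chosen items as prototypes.
    protos = rng.choice(n, size=n_units, replace=n < n_units)

    for _ in range(n_iter):
        # Assign every item to the unit whose prototype is least dissimilar.
        bmu = D[:, protos].argmin(axis=1)
        # cost[i, r] = sum_j H[r, bmu[j]] * D[i, j]: neighbourhood-weighted
        # dissimilarity of candidate item i to the items mapped near unit r.
        cost = D @ H[:, bmu].T
        new_protos = cost.argmin(axis=0)
        if np.array_equal(new_protos, protos):
            break
        protos = new_protos

    bmu = D[:, protos].argmin(axis=1)
    return protos, bmu

# Hypothetical toy data: two groups of points on a line, turned into a
# dissimilarity matrix. In practice D might hold distances between
# gene sequences instead.
pts = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])
D = np.abs(pts[:, None] - pts[None, :])
protos, bmu = median_som(D, rows=1, cols=2, sigma=0.5)
print(protos, bmu)
```

Because the prototypes are restricted to actual data items, the sketch never needs vector coordinates, which is the property that makes dissimilarity-based maps applicable to sequence comparisons.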
Similar resources
Soft Topographic Map for Clustering and Classification of Bacteria
In this work a new method for clustering and building a topographic representation of a bacteria taxonomy is presented. The method is based on the analysis of stable parts of the genome, the so-called “housekeeping genes”. The proposed method generates topographic maps of the bacteria taxonomy, where relations among different type strains can be visually inspected and verified. Two well known D...
Topographic Map of Gammaproteobacteria using 16S rRNA gene sequence
In this work a method for building maps representing topographic visualization of bacteria taxonomy is presented. The method uses bacteria genotype information regarding the 16S rRNA “housekeeping gene”, which represents a stable part of the bacterial genome. In order to test the proposed method, we consider the Gammaproteobacteria class and we build a sequence dataset with 147 type strains, downloa...
Monitoring the Formation of Kernel-Based Topographic Maps with Application to Hierarchical Clustering of Music Signals
When using topographic maps for clustering purposes, which is now being considered in the data mining community, it is crucial that the maps are free of topological defects. Otherwise, a contiguous cluster could become split into separate clusters. We introduce a new algorithm for monitoring the degree of topology preservation of kernel-based maps during learning. The algorithm is applied to a ...
William P. Hanage, Christophe Fraser and Brian G. Spratt*
Whatever else they should share, strains of bacteria assigned to the same species should have housekeeping genes that are similar in sequence. Single gene sequences (or rRNA gene sequences) have very few informative sites to resolve the strains of closely related species, and relationships among similar species may be confounded by interspecies recombination. A more promising approach (multiloc...
Journal title:
- Adv. Artificial Neural Systems
Volume 2011 Issue
Pages -
Publication date 2011